-
Notifications
You must be signed in to change notification settings - Fork 18
Rsync and Fuse Issues #139
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
MisterDA
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Could you squash the commits when you'll think this is ready for merging?
tmcgilchrist
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor style suggestions but otherwise the change looks good.
We need an osx rsync stress test to validate we don't leak disk space.
lib/sandbox.macos.ml
Outdated
| if Lwt.is_sleeping proc then ( | ||
| match !proc_id with | ||
| if Lwt.is_sleeping proc then | ||
| let _ = match !proc_id with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand exactly what's happening here, but it looks wrong as you're not binding the Lwt promise, or am I reading that wrong?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have updated the binding and changed the function a bit as well.
|
The LWT sleep looks curious to me. Does the kill_all_descendants function
not wait for the command to terminate?
…On Wed, 18 Jan 2023 at 6:35 pm, Mark Elvers ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In lib/sandbox.macos.ml
<#139 (comment)>:
> @@ -127,12 +127,11 @@ let run ~cancelled ?stdin:stdin ~log (t : t) config result_tmp =
in
Lwt.on_termination cancelled (fun () ->
let aux () =
- post_cancellation ~result_tmp t >>= fun () ->
- if Lwt.is_sleeping proc then (
- match !proc_id with
+ if Lwt.is_sleeping proc then
+ let _ = match !proc_id with
I have updated the binding and changed the function a bit as well.
—
Reply to this email directly, view it on GitHub
<#139 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AABJXOPLD6BHOXR3QHVXLZTWS6MNRANCNFSM6AAAAAAT34STOY>
.
You are receiving this because you commented.Message ID:
***@***.***>
|
|
|
I've looked in depth, and I'm convinced that we're correctly waiting until all |
lib/sandbox.macos.ml
Outdated
| ) | ||
| else Lwt.return_unit (* Process has already finished *) | ||
| else Lwt.return_unit >>= fun () -> | ||
| post_cancellation ~result_tmp t |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I meant is that the indentation of post_cancellation is confusing: it is actually part of the else block here. If you want to execute it regardless of the Lwt.is_sleeping proc, you have to enclose the if … else … block in a begin … end (or parenthesis) block.
I think that's what you want.
open Lwt.Infix
let () =
let x = ref true in
Lwt_main.run begin
if !x then
Lwt_io.print "hello"
else
Lwt_io.print "world"
>>= fun () ->
Lwt_io.print "else branch"
end
(* prints "hello" *)
let () =
let x = ref true in
Lwt_main.run begin
begin if !x then
Lwt_io.print "hello"
else
Lwt_io.print "world"
end >>= fun () ->
Lwt_io.print " common"
end
(* prints "hello common" *) You could configure your editor to use ocp-indent (since we're not applying ocamlformat here).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I have an updated version with that modification which I haven't pushed yet.
lib/sandbox.macos.ml
Outdated
| post_build ~result_dir ~home_dir t >>= fun () -> | ||
| Lwt.return r | ||
| in | ||
| let () = Lwt.on_failure cancelled (fun _ -> let _ = unmount_fuse t in () ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The returned promise most likely needs to be passed to Lwt.async to be executed by Lwt, just like the function passed to Lwt.on_termination below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that on_failure can be removed as on_termination covers both cases.
|
Thanks @MisterDA. Right now, the issue I am seeing is that when a job is cancelled, a new job is started immediately, but new job appears to run concurrently with the on_termination code. For a normal worker, this isn't a problem because jobs can run concurrently, but on the Mac, the new job is starting as the old one is still finishing both using the same home directory. |
ae9cdaa to
fc2a064
Compare
|
What's the state of this PR now? Are you satisfied with your changes? |
89e84ac to
ccd3f2f
Compare
|
Merged as of 66f5164, thanks! |
- Updates to address rsync and sandbox issues (ocurrent/obuilder#139) - Minor updates.
- Updates to address rsync and sandbox issues (ocurrent/obuilder#139) - Minor updates.
CHANGES: - Updates to address rsync and sandbox issues. (@mtelvers ocurrent/obuilder#139, reviewed by @tmcgilchrist and @MisterDA) - Add an obuilder clean command to clean all build results. (@MisterDA ocurrent/obuilder#140, reviewed by @tmcgilchrist) - Make rsync-mode mandatory when using rsync store. (@tmcgilchrist ocurrent/obuilder#132, reviewed by @kit-ty-kate and @MisterDA)
- Updates to address rsync and sandbox issues (ocurrent/obuilder#139) - Minor updates.
The
rsyncfolderresult-tmpgrows uncontrollably, eventually leading to the worker stalling on with the following error:This is caused by the following two considitons:
result-tmpfolder, andrsyncbackend does not delete theresult-tmp.Furthermore, under normal circumstances, the fuse file system is unmounted between jobs. This is essential for the Docker health check to be able to find the Docker binary which is installed in
/usr/local. However, when a job is cancelled, fuse is not unmounted; therefore, we see the health check fail if it occurs immediately after a cancelled job.